Add a high-level API for creating repodata #271

dralley · 2021-05-28T02:47:07Z

This is a mostly-functional prototype of a more convenient (Python) API for writing repos that I threw together.

There's no obligation to merge this, but is it something you would be interested in merging? I figure that 90% of this code is code that any consumer of the API would need to write or copy anyway (I started with just the example code).

If the answer is yes, I can finish it off, write documentation and tests, etc. Otherwise we can just close.

dralley · 2021-05-28T02:52:22Z

Not sure why that test would be failing.

edit: Interesting, it passes on Fedora 33, and fails on Fedora 34

Did something change with libxml? Filed #272

kontura · 2021-06-02T13:32:15Z

Yeah, the failing tests are unrelated. In addition to the libxml2 issue you described, there is also a zchunk regression.

Regarding the PR itself, I personally would be fine with merging this. A higher-level API that is closer in spirit to the createrepo_c binary seems nice to me.

Conan-Kudo · 2021-06-08T13:04:38Z

I really like the idea of a simpler high level API too.

dralley · 2021-06-08T13:21:45Z

Great, I'll keep working on it then.

dralley · 2021-08-07T14:56:20Z

I have some API design questions, see the TODO notes

kontura · 2021-08-10T11:09:26Z

src/python/createrepo_c/__init__.py

+            if preserve_existing_metadata:
+                x = 0
+                while True:
+                    new_repodata_path = f"{self.repodata_path}_{x}"


I think createrepo_c binary and RepositoryWriter should be as consistent as possible. How about using something closer to --retain-old-md and --keep-all-metadata instead of preserve_existing_metadata?

Or do you prefer this behavior?

dralley · 2021-09-04

This is just what the example had been doing when I copied it over. I think there's a good argument to be made either way. It does seem a little silly to keep the metadata files around but throw away the repomd.xml that tied them together. If you keep the old repodata directories, it's fairly straightforward to turn one back into a functional repo (assuming the packages are still there).
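For readers following along, here is a minimal sketch of the `repodata_{x}` behavior being discussed, using only the standard library. The diff above shows just the start of the loop, so the existence check and the rename are assumptions about what the rest does, and `next_backup_path` and `preserve_existing` are hypothetical helper names, not part of createrepo_c.

```python
from pathlib import Path
from typing import Optional


def next_backup_path(repodata_path: Path) -> Path:
    """Return the first sibling path `<name>_<x>` that does not exist yet."""
    x = 0
    while True:
        candidate = repodata_path.with_name(f"{repodata_path.name}_{x}")
        if not candidate.exists():
            return candidate
        x += 1


def preserve_existing(repodata_path: Path) -> Optional[Path]:
    """Move an existing repodata directory aside and return its new path.

    Returns None when there is nothing to preserve. Assumption: the old
    directory is renamed as a whole, so its repomd.xml stays next to the
    metadata files it describes.
    """
    if not repodata_path.exists():
        return None
    backup = next_backup_path(repodata_path)
    repodata_path.rename(backup)
    return backup
```

Because the whole directory is moved, restoring an old snapshot would amount to renaming `repodata_0` back to `repodata`, which is the "turn it back into a functional repo" point made above.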

kontura · 2021-08-10T11:18:42Z

src/python/createrepo_c/__init__.py

+                    sqlite_metadata_info.writer.close()
+            else:
+                record = RepomdRecord(record_name, str(metadata_info.path))
+                record = record.compress_and_fill(self._metadata_checksum_type, BZ2_COMPRESSION)


While trying out the RepositoryWriter I noticed the created repositories are invalid. I think this section specifically exposes a bug in the library.

record.compress_and_fill(self._metadata_checksum_type, BZ2_COMPRESSION) returns a new record with filled-in checksums, while the old one (created on the line above, 603) is freed. However, these two lines: https://github.com/rpm-software-management/createrepo_c/blob/master/src/repomd.c#L557-L558 make the new record use the chunk from the old (freed) one. This leads to garbage values.
Could this be fixed as a part of this PR?

dralley · 2021-09-07

@kontura It might take a while before this PR is ready, depending on how much time I can dedicate to it. The fix should probably go in a separate PR. I can maybe take a shot at it next week, though.

kontura · 2021-09-07

OK, that is no problem.

I think we just need to change record->chunk to crecord->chunk for both of those lines. 🙂
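To make the ownership problem concrete, here is a toy Python model of it. This is not the library's C code: `ToyChunk`, `ToyRecord`, and the `fixed` flag are hypothetical names invented for illustration, and the toy raises an error on a stale read where the real bug would silently yield garbage values.

```python
class ToyChunk:
    """Toy stand-in for the string storage owned by one record."""

    def __init__(self):
        self._strings = []
        self._freed = False

    def insert(self, text):
        """Store a string and return a handle to it."""
        self._strings.append(text)
        return len(self._strings) - 1

    def get(self, handle):
        if self._freed:
            raise RuntimeError("read from a freed chunk")
        return self._strings[handle]

    def free(self):
        self._strings.clear()
        self._freed = True


class ToyRecord:
    """Toy record whose string field lives in some chunk."""

    def __init__(self):
        self.chunk = ToyChunk()
        self._location = None  # (owning chunk, handle)

    def set_location(self, owner, text):
        self._location = (owner, owner.insert(text))

    @property
    def location(self):
        owner, handle = self._location
        return owner.get(handle)


def compress_and_fill(record, fixed):
    """Return a new record; `fixed` selects whose chunk stores its strings."""
    crecord = ToyRecord()
    # Buggy variant: strings go into the OLD record's chunk.
    # Fixed variant: strings go into the NEW record's own chunk.
    owner = crecord.chunk if fixed else record.chunk
    crecord.set_location(owner, "repodata/primary.xml.bz2")
    return crecord


def location_survives_free(fixed):
    """Build a new record, free the original, then try to read the new one."""
    record = ToyRecord()
    crecord = compress_and_fill(record, fixed)
    record.chunk.free()  # models the old record being freed
    try:
        return crecord.location
    except RuntimeError:
        return None
```

In this model `location_survives_free(False)` returns `None` because the new record's string lived in the freed chunk, while `location_survives_free(True)` returns the stored path. That mirrors the proposed one-word change from `record->chunk` to `crecord->chunk`.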

dralley · 2023-10-31T05:11:28Z

@kontura If I go forwards with this, are you ok with commits 2 and 3?

kontura · 2023-10-31T05:34:17Z

Yes, and maybe we could even merge metadata_compression with general_compression?

dralley · 2024-01-08T03:45:13Z

This is ready to be looked at. I don't claim it's perfect but I tried to get decent test coverage.

dralley · 2024-01-12T20:23:17Z

@kontura ^

dralley · 2024-01-24T14:25:57Z

Bump!

kontura · 2024-01-24T14:52:01Z

@dralley sorry I am a bit swamped now, it might take a while.

kontura · 2024-03-08

I found a couple of issues in one of the examples.

examples/python/simple_repository_writing.py

dralley · 2024-03-11T02:58:23Z

Updated

kontura · 2024-03-11T07:11:20Z

Thank you!

dralley · 2024-03-11T21:44:18Z

src/python/createrepo_c/__init__.py

+        self.working_metadata_files["filelists"].writer.add_pkg(pkg)
+        self.working_metadata_files["other"].writer.add_pkg(pkg)
+
+    def add_repomd_metadata(self, name, path, use_compression=True):


@kontura I didn't notice this until now (sorry), but add_repomd_metadata is a bit redundant. Should we fix that name? Or is it fine because we're adding metadata to "repomd"?

kontura · 2024-03-12

So, change it to just add_metadata?
I think I would like that a little bit more, but I am also fine with the current form.

If you want to change it, I believe we still can, but the sooner the better.

dralley force-pushed the new-api branch from 6850f5a to 1ea07f1 Compare May 29, 2021 21:38

dralley force-pushed the new-api branch 2 times, most recently from ef94963 to ccb8a6d Compare June 2, 2021 18:02

dralley force-pushed the new-api branch from ccb8a6d to 1eafa00 Compare August 7, 2021 03:33

dralley changed the title ~~Add experimental new high-level API prototype for creating repodata~~ Add a high-level API prototype for creating repodata Aug 7, 2021

dralley changed the title ~~Add a high-level API prototype for creating repodata~~ Add a high-level API for creating repodata Aug 7, 2021

dralley force-pushed the new-api branch 8 times, most recently from 34ec1be to e70835b Compare August 7, 2021 14:54

dralley force-pushed the new-api branch 4 times, most recently from 1b56c81 to babcaf3 Compare August 8, 2021 22:25

kontura reviewed Aug 10, 2021

dralley force-pushed the new-api branch 3 times, most recently from 1176d3e to d516d7f Compare September 4, 2021 04:42

dralley force-pushed the new-api branch from d516d7f to 67f4474 Compare November 29, 2021 21:12

dralley mentioned this pull request Aug 18, 2022

Refactor publishing code pulp/pulp_rpm#2720

Open

dralley force-pushed the new-api branch from 67f4474 to 0c560f5 Compare October 3, 2023 20:22

dralley force-pushed the new-api branch from 0c560f5 to 236e118 Compare October 3, 2023 20:26

dralley force-pushed the new-api branch from 21919a6 to 0a2edf6 Compare November 8, 2023 04:28

dralley force-pushed the new-api branch from 0a2edf6 to 2f9ba9b Compare December 4, 2023 04:03

dralley force-pushed the new-api branch from 2f9ba9b to 31221ff Compare January 8, 2024 03:41

dralley marked this pull request as ready for review January 8, 2024 03:41

dralley requested a review from kontura January 8, 2024 03:41

dralley force-pushed the new-api branch from 7e9e68f to c3acd52 Compare January 8, 2024 03:52

dralley commented Jan 8, 2024

dralley commented Jan 11, 2024

jan-kolarik self-requested a review February 1, 2024 14:37

Conan-Kudo approved these changes Feb 8, 2024

jan-kolarik reviewed Feb 15, 2024

kontura self-assigned this Mar 7, 2024

kontura reviewed Mar 8, 2024

dralley added 3 commits March 10, 2024 22:55

Add a high-level repository reading API

51d7058

Abstracts away a lot of manual logic around setup and state tracking.

Add a high-level repository writing API

5a4c709

The existing API is frustratingly verbose and low-level. This will make it trivial to create repositories with only a few lines.

Add tests for RepositoryReader and RepositoryWriter

f248089

dralley force-pushed the new-api branch from c3acd52 to f248089 Compare March 11, 2024 02:58

kontura merged commit a4e4c45 into rpm-software-management:master Mar 11, 2024
7 checks passed

dralley deleted the new-api branch March 11, 2024 21:31

dralley commented Mar 11, 2024


